perm filename 0[00,BGB]1 blob sn#043252 filedate 1973-05-22 generic text, type C, neo UTF8
COMMENT āŠ—   VALID 00006 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	STANFORD ARTIFICIAL INTELLIGENCE LABORATORY	            MAY 1973.
C00005 00003	PAge 1. Illustration: 16 frames of a doll baby on a turn table.
C00006 00004	PAGE 2. INTRODUCTION.
C00015 00005	PAGE 3. Illustration: 32 frames of blocks on a turn table.
C00016 00006	PAGE 4. INTRODUCTION.
C00020 ENDMK
CāŠ—;
STANFORD ARTIFICIAL INTELLIGENCE LABORATORY	            MAY 1973.
MEMO AIM-199

COMPUTER SCIENCE DEPARTMENT REPORT
NO. CS-362


                    Image Contouring and Comparing.


                          Bruce g. Baumgart


ABSTRACT:

	A contour  image representation  is stated  and an  algorithm
for  converting  a   set  of  digital  television  images  into  this
representation is explained.   The algorithm consists of five  steps:
digital  image  thresholding,    binary image  contouring,    polygon
nesting,     polygon   smoothing,     and  polygon  comparing.     An
implementation of  the algorithm  is the  main routine  of a  program
called CRE;  auxiliary routines provide cart and  turn table control,
TV camera  input,   image  display,    and xerox  printer  output.  A
serendip application  of CRE to  type font contruction  is explained.
Details  about the intended application  of CRE to  the perception of
physical objects will appear in sequels to this paper.



	CONTENTS:

		     Introduction.

		  I. The CRE data structure.

		 II. The CRE algorithm.

		III. Using CRE.

		 IV. Using TVFONT.

		     Postscripts.



This  research was supported by the Advanced Research Projects Agency
   of the Office of the Secretary of Defense under contract SD-183.
PAge 1. Illustration: 16 frames of a doll baby on a turn table.
PAGE 2. INTRODUCTION.

	The acronym CRE  stands both for "Contour, Region,  Edge" and
for  "Cart's  Eye". CRE  is  a  solution to  the  problem  of finding
contour  edges  in   a  set  of   television  pictures  and   linking
corresponding edges  from one picture  to the  next.  The  process is
automatic   and  is  intended  to  run  without  human  intervention.
Furthermore,   the process  is bottom  up; there  are no  significant
inputs other than the given  television images.  The output of CRE is
a 2D  contour map  data structure  which is  suitable input  to a  3D
geometric modeling program.

	The overall design  goal for CRE  was to build a  region edge
finding  program that could  be applied  to a sequence  of television
pictures and that  would output a sequence  of line drawings  without
having to know anything about  the content of the images. Furthermore
it was  desired that the line drawings be structured.  The six design
choices that determined the character of CRE are:

	1. Dumb vision rather than model driven vision.
	2. Multi image analysis rather than single image analysis.
	3. Total image structure imposed on edge finding; rather
	   than separate edge finder and image analyzer.
	4. Automatic rather than interactive.
	5. Fixed image window size rather than variable window size.
	6. Machine language rather than higher level language.

	The design  choices are  ordered from  the more strategic  to
the   more  tactical;   the  first   three  choices   being  research
strategies, the  latter  three  choices  being  programming  tactics.
Adopting these  design choices lead  to image contouring  and contour
map structures similar to that of Krakauer[3] and Zahn[4].

	The first design  choice does not  refer to the issue  of how
model dependent a  finished general vision system will be (it will be
quite model dependent),  but rather to the issue of where  one should
begin  building such  a  system.  I believe  that  the best  starting
points are  at the two apparent extremes of nearly total knowledge of
a particular  visual  world or  nearly  total ignorance.  The  former
extreme involves  synthesis (by computer graphics) of  a predicted 2D
image, followed by comparing the predicted and a perceived image  for
slight  differences which  are  expected but  not  yet measured.  The
other  extreme involves  anaylsing  perceived images  into structures
which can be  readily compared (for near  equality) and measured  for
slight differences;  followed by the  construction of a  3D geometric
model of  the perceived world. The point is that in both cases images
are compared,  and in both cases the 3D  model initially (or finally)
contains specific  numerical data on the geometry  and physics of the
particular world being looked at.
PAGE 3. Illustration: 32 frames of blocks on a turn table.
PAGE 4. INTRODUCTION.

	The  second design  choice, of  multi  image anaylsis  rather
than single  image analysis, provides a basis  for solving for camera
positions and feature  depths.   The third design  choice solves  (or
rather avoids)  the problem of  integrating an edge  finder's results
into  an image; by using  a very simple edge  finder and by accepting
all its findings  as valid photometric edges  that are indeed a  part
of  the image,   the  image  structure is  never lost.    This design
postpones the problem of  interpreting photometric edges as  physical
edges.

	The fourth  choice  is a  resolution not  to  write an  image
processor  that requires  operator  assistance and  parameter tuning.
The fifth choice of the  216 by 288 fixed  window size is a sin  that
proved  surprisingly expedient,  it  is explained  later. A  variable
window  version of CRE at halves,   thirds and other simple fractions
of its present window size will be made at some future date.

	The  final design choice  of using  machine language  was for
the  sake  of  implementing  node  link  data  structures  that   are
processed 100 faster  than LEAP, 10  times faster than  compiled LISP
and  that require significantly  less memory  than similar structures
in either LISP or LEAP. Furthermore machine code assembles  and loads
faster  than  higher  level  languages;   and  machine  code  can  be
extensively fixed and altered without recompiling.

	It  is  my  impression  that  CRE  does  not  raise  any  new
scientific problems;  nor does it  have any  really new solutions  to
the old  problems; rather CRE is another  competent video region edge
finding program with its  own set of tricks.  However, it is  further
my impression that the particular tricks for  smoothing,  nesting and
comparing polygons in CRE are original as programming techniques.

	The  intended use of CRE  is illustrated by  the sequences of
turn table pictures on pages 1  and 3. The first results of  applying
CRE to typography is illustrated below:

Christmas Font

	A B C D E F G H I J K L M N O P Q R S T U V W X Y Z.
	a b c d e f g h i j k l m n o p q r s t u v w x y z.

ACKNOWLEDGEMENT.

	Tovar Mock assisted me with  the development of the type font
making program,   TVFONT.